 ai pair programmer


Asleep at the Keyboard? Assessing the Security of GitHub Copilot's Code Contributions

Communications of the ACM

There is burgeoning interest in designing AI-based systems to assist humans in designing computing systems, including tools that automatically generate computer code. The most notable of these comes in the form of the first self-described "AI pair programmer," GitHub Copilot, a language model trained over open source GitHub code. However, code often contains bugs--and so, given the vast quantity of unvetted code that Copilot has processed, it is certain that the language model will have learned from exploitable, buggy code. This raises concerns on the security of Copilot's code contributions. In this work, we systematically investigate the prevalence and conditions that can cause GitHub Copilot to recommend insecure code.


The Impact of Generative AI on Collaborative Open-Source Software Development: Evidence from GitHub Copilot

Song, Fangchen, Agarwal, Ashish, Wen, Wen

arXiv.org Artificial Intelligence

Generative artificial intelligence (AI) has opened the possibility of automated content production, including coding in software development, which can significantly influence the participation and performance of software developers. To explore this impact, we investigate the role of GitHub Copilot, a generative AI pair programmer, in software development in the open-source community, where multiple developers voluntarily collaborate on software projects. Using GitHub's dataset for open-source repositories and a generalized synthetic control method, we find that Copilot significantly enhances project-level productivity by 6.5%. Delving deeper, we dissect the key mechanisms driving this improvement. Our findings reveal a 5.5% increase in individual productivity and a 5.4% increase in participation. However, this is accompanied by a 41.6% increase in integration time, potentially due to higher coordination costs. Interestingly, we also observe differential effects among developers. We discover that core developers achieve greater project-level productivity gains from using Copilot, benefiting more in terms of individual productivity and participation compared to peripheral developers, plausibly due to their deeper familiarity with software projects. We also find that the increase in project-level productivity is accompanied by no change in code quality. We conclude that AI pair programmers help developers automate and augment their code, but human developers' knowledge of software projects can enhance the benefits. In summary, our research underscores the role of AI pair programmers in impacting project-level productivity within the open-source community and suggests potential implications for the structure of open-source software projects.


Rethinking Software Engineering in the Foundation Model Era: From Task-Driven AI Copilots to Goal-Driven AI Pair Programmers

Hassan, Ahmed E., Oliva, Gustavo A., Lin, Dayi, Chen, Boyuan, Jiang, Zhen Ming

arXiv.org Artificial Intelligence

The advent of Foundation Models (FMs) and AI-powered copilots has transformed the landscape of software development, offering unprecedented code completion capabilities and enhancing developer productivity. However, the current task-driven nature of these copilots falls short in addressing the broader goals and complexities inherent in software engineering (SE). In this paper, we propose a paradigm shift towards goal-driven AI-powered pair programmers that collaborate with human developers in a more holistic and context-aware manner. We envision AI pair programmers that are goal-driven, human partners, SE-aware, and self-learning. These AI partners engage in iterative, conversation-driven development processes, aligning closely with human goals and facilitating informed decision-making. We discuss the desired attributes of such AI pair programmers and outline key challenges that must be addressed to realize this vision. Ultimately, our work represents a shift from AI-augmented SE to AI-transformed SE by replacing code completion with a collaborative partnership between humans and AI that enhances both productivity and software quality.


Why Use GitHub Copilot And Copilot Labs: Practical Use Cases for the AI Pair Programmer

#artificialintelligence

Even though I didn't work at GitHub when they announced Copilot, I remember it piqued my interest. Perhaps I was mostly excited because it was new and shiny. For me, the value of Copilot is that I spend less time stressing over syntax, which leaves more time for solving problems. More recently, GitHub Next, a team exploring the future of technology and software beyond the adjacent-possible, released Copilot Labs. This experimental VS Code sidebar lets developers translate their code from one programming language to another and explains code snippets in plain language. These sound super cool, but when would you use them?


New Azure OpenAI Service Offers GPT-3 Natural Language Models -- Virtualization Review

#artificialintelligence

The foundational technology powering new AI coding assistants and other next-gen offerings based on natural language models is going to become an Azure cloud service. Microsoft announced the new Azure OpenAI Service during this week's Ignite 2021 tech event. It's based on GPT-3, an autoregressive language model that produces human-like text by leveraging deep learning, a machine learning construct that imitates the way people gain certain types of knowledge. GPT-3 comes from Microsoft partner OpenAI, an AI research and development company. The GPT-3 language model has been put to many uses, including no-code natural language software development in Power Apps, Microsoft's low-code development offering. Microsoft has a license to infuse GPT-3 technology into its products.



What OpenAI and GitHub's "AI pair programmer" means for the software industry

#artificialintelligence

OpenAI has once again made the headlines, this time with Copilot, an AI-powered programming tool jointly built with GitHub. Built on top of GPT-3, OpenAI's famous language model, Copilot is an autocomplete tool that provides relevant (and sometimes lengthy) suggestions as you write code. Copilot is currently available to select applicants as an extension in Visual Studio Code, the flagship programming tool of Microsoft, GitHub's parent company. While the AI-powered code generator is still a work in progress, it provides some interesting hints about the business of large language models and the future directions of the software industry. The official website of Copilot describes it as an "AI pair programmer" that suggests "whole lines or entire functions right inside your editor."


KDnuggets News 21:n25, Jul 7: Data Scientists and ML Engineers Are Luxury Employees; 5 Lessons from McKinsey That Will Make You a Better Data Scientist - KDnuggets

#artificialintelligence

In this issue: Are Data Scientists and ML Engineers Luxury Employees?

KDnuggets Top Blogs Reward Program will pay the authors of top blogs each month. Reposts accepted, but original submissions get 3x the rate of reposts. Check our guidelines and submit your blog soon!

Features
Data Scientists and ML Engineers Are Luxury Employees, by Adrien Biarnes
5 Lessons McKinsey Taught Me That Will Make You a Better Data Scientist, by Tessa Xie
Managing Your Reusable Python Code as a Data Scientist, by Matthew Mayo
GitHub Copilot: Your AI pair programmer - what is all the fuss about?, by Matthew Mayo
A Learning Path To Becoming a Data Scientist, by Sara Metwalli

Tutorials, Overviews
ROC Curve Explained, by Zolzaya Luvsandorj
Predict Customer Churn (the right way) using PyCaret, by Moez Ali
Semantic Search: Measuring Meaning From Jaccard to Bert, by James Briggs
High-Performance Deep Learning: How to train smaller, faster, and better models - Part 3, by Gaurav Menghani
Prepare Behavioral Questions for Data Science Interviews, by Zijing Zhu
How to Use NVIDIA GPU Accelerated Libraries, by Kevin Vu
Learning Data Science Through Social Media, by Susan Sivek
From Scratch: Permutation Feature Importance for ML Interpretability, by Seth Billiau

Opinions
How To Transition From Data Freelancer to Data Entrepreneur (Almost Overnight), by Lillian Pierson
Ethics, Fairness, and Bias in AI, by Aditya Aggarwal

Top Stories, Tweets
Top Stories, Jun 28 - Jul 4: 5 Lessons McKinsey Taught Me That Will Make You a Better Data Scientist, by KDnuggets

Jobs
See our recent jobs in AI, Analytics, Data Science, Machine Learning
You can post a free short entry on KDnuggets jobs page for an industry or academic job related to AI, Big Data, Data Science, or Machine Learning, email - see details at kdnuggets.com/jobs


Your AI pair programmer

#artificialintelligence

The ultimate help and support for developers is here, courtesy of OpenAI and GitHub. Developed in collaboration with OpenAI, GitHub Copilot is powered by OpenAI Codex, a new AI system created by OpenAI. OpenAI Codex has broad knowledge of how people use code and is significantly more capable than GPT-3 in code generation, in part because it was trained on a data set that includes a much larger concentration of public source code. GitHub Copilot works with a broad set of frameworks and languages, but this technical preview works especially well for Python, JavaScript, TypeScript, Ruby and Go. So if you find programming confusing, it will help you write neat code, and it will even help in interviews if you're going for competitive programming.


GitHub Copilot · Your AI pair programmer

#artificialintelligence

Does GitHub Copilot ever output personal data? Because GitHub Copilot was trained on publicly available code, its training set included public personal data included in that code. From our internal testing, we found it to be extremely rare that GitHub Copilot suggestions included personal data verbatim from the training set. In some cases, the model will suggest what appears to be personal data – email addresses, phone numbers, access keys, etc. – but is actually made-up information synthesized from patterns in training data. For the technical preview, we have implemented a rudimentary filter that blocks emails when shown in standard formats, but it's still possible to get the model to suggest this sort of content if you try hard enough.
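
To make the idea of a rudimentary filter for emails "in standard formats" concrete, here is a minimal sketch in Python. It is not GitHub's actual filter: the pattern and the names `EMAIL_PATTERN`, `contains_email`, and `filter_suggestions` are assumptions chosen for illustration only.

```python
import re

# Hypothetical pattern for email addresses written in a standard format
# (local-part@domain.tld). This is an illustrative assumption, not the
# filter GitHub Copilot uses.
EMAIL_PATTERN = re.compile(
    r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*\.[A-Za-z]{2,}"
)


def contains_email(suggestion: str) -> bool:
    """Return True if the suggestion contains a standard-format email address."""
    return EMAIL_PATTERN.search(suggestion) is not None


def filter_suggestions(suggestions):
    """Keep only the suggestions that contain no standard-format email address."""
    return [s for s in suggestions if not contains_email(s)]
```

A pattern like this only catches addresses in the usual written form. A string such as "jane dot doe at example dot com" passes straight through, which is in line with the caveat above that such content can still appear if you try hard enough.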